Coordinates, Conversions, and Kinematics
for the Rochester Robotics Lab

Christopher M. Brown and Raymond D. Rimey

Technical Report 259
August 1988

University of Rochester
Computer Science


Abstract 

This  is  a  guide  to  coordinate  systems,  representations,  and  geometric 
relationships  between  them,  for  components  of  the  Rochester  Robotics 
Laboratory.  The  main  entities  at  issue  are  the  joint  angles,  location 
variables,  and  coordinate  systems  of  the  Puma,  the  camera  angles  and 
coordinate  systems  associated  with  the  head,  the  spatial  location  of 
three-dimensional  points,  and  the  kinematic  and  inverse  kinematic 
relationships  between  them.  The  robot-to-camera  kinematic  chain  is 
described,  conversions  between  homogeneous  transformations  and  VAL 
location  descriptions  are  provided,  and  inverse  problems  (camera 
angles  to  aim  cameras  at  a  3-D  point  given  a  robot  configuration, 
binocular  stereo  calculations)  are  solved.  Constants  describing  the 
robot head and sample robot description data structures are provided.

1. Purpose

The  purpose  of  this  document  is  to  relate  elementary  kinematic  concepts  and 
calculations  to  the  University  of  Rochester  Robotics  Laboratory,  with  the  aim  of  making 
certain  aspects  of  the  Puma  arm  and  the  two-camera  head  easier  to  use  and  understand. 
In  using  the  robot  to  interact  with  the  world,  one  quickly  finds  a  potentially  bewildering 
set  of  coordinate  systems,  angles,  parameters,  and  state  descriptions  that  must  be  related 
one  to  another  in  order  to  produce  coherent  robot  actions  and  to  answer  common  robotic 
questions.  This  document  mainly  concerns  the  definition  and  manipulation  of  coordinate 
systems.  The  semantics  of  the  coordinates  are  such  things  as  tool  positions,  camera 
orientations,  and  so  forth. 

Section 2 presents a short discussion of transformation notation and properties,
which should be read. There follows a glossary of scalars, vectors, and transformations
we  use  later  in  the  document,  which  can  be  skimmed  and  referred  to  as  needed. 

Section  3  defines  some  important  robotic  coordinate  systems.  LAB  is  the  base 
coordinate  system  attached  to  the  laboratory.  TOOL  describes  the  location  of  the  robot 
head.  FLANGE  is  another  description  for  head  location,  but  one  more  convenient  for  use 
with  imaging  operations. 

Section  4  describes  the  model  of  the  imaging  process.  Section  5  describes  and 
defines  the  transformations  along  the  chain  of  links  that  the  Puma  and  the  head  embody. 

There  follow  several  sections,  each  one  describing  how  to  convert  from  one 
representation  to  another,  or  deriving  a  desired  transformation  or  description  from  a 
specification. For example, it may be of interest to convert the TOOL coordinate system
into the description used by VAL for robot location. An example of deriving an
interesting configuration is to find the camera altitude and azimuth angles that aim the
camera at a point in (X,Y,Z) space.

There  is  room  for  expansion  of  this  report,  and  an  expanded  version  should  be 
produced  later  when  more  is  known.  One  obvious  lack  at  present  is  the  Jacobian 
calculations  for  the  head  --  at  what  rate  to  move  the  cameras  to  compensate  continuously 
for  continuous  head  motion  and  vice-versa. 

2.  Transform  Basics 

We follow [Paul 1981] and represent 3-space points as column homogeneous
4-vectors or column Cartesian 3-vectors. Transforms (and coordinate systems, or CSs)
are homogeneous 4x4 matrices. All transforms but the camera transform are rigid. Thus
they denote a rigid rotation or translation or both. If both, then think of the rotation as
being done before the translation. A transform B operates on points expressed as column
vectors to yield new points. A transform B represents a CS in that it can be thought of as
four  columns,  three  of  which  represent  points  at  infinity  and  correspond  to  directions  of 
the  X,Y,  and  Z  axes  of  a  CS,  and  the  last  of  which  represents  a  3-space  point  and 
corresponds  to  the  origin  of  the  CS.  Transforming  (that  is,  multiplying)  a  CS  by  a 
transform  just  rigidly  moves  the  CS  in  space.  LAB  is  the  identity  transform,  and 
transforming  LAB  by  B  yields  B.  Thus  B  cleverly  represents  a  coordinate  system  and  a 
transform  that  moves  LAB  to  that  coordinate  system. 

If $\mathbf{x}$ is a vector denoting a point in LAB coordinates, and A and B are transforms,
then $B\mathbf{x}$ gives the coordinates of ("is") the point, in LAB coordinates, that results from
rotating and translating $\mathbf{x}$ by B. $AB\mathbf{x}$ is the point resulting from applying B to $\mathbf{x}$, then A
to the result, where rotating and translating by A is done with respect to the original LAB
coordinate system. Alternatively, $AB\mathbf{x}$ means applying to $\mathbf{x}$ the transformation A,
followed by the transform B expressed in the frame A. If each transform in a chain of links
is conceptualized as taking place in the coordinate system induced by all previous movements,
the final transform is $A_1 A_2 \cdots A_n$ if 1 is the link connected to LAB. If $\mathbf{x}$ is a point in
LAB and B is a frame, then $\mathbf{x}$ expressed in B is $B^{-1}\mathbf{x}$. Last, if B takes LAB to the CS B
(its alibi aspect), then B also is the transform from B coordinates to LAB coordinates (its
alias aspect).
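
The alibi and alias readings of a transform can be checked numerically. The following is a minimal sketch in plain Python, not the lab's library; the helper names (rot_z, trans, matmul, apply, rigid_inverse) and the sample numbers are assumptions made for illustration. It builds B as a rotation followed by a translation, moves a LAB point with B, and expresses the same LAB point in frame B with the inverse of B.

```python
import math

def rot_z(deg):
    """Homogeneous rotation about the Z axis (angle in degrees)."""
    c, s = math.cos(math.radians(deg)), math.sin(math.radians(deg))
    return [[c, -s, 0.0, 0.0], [s, c, 0.0, 0.0],
            [0.0, 0.0, 1.0, 0.0], [0.0, 0.0, 0.0, 1.0]]

def trans(x, y, z):
    """Homogeneous translation by (x, y, z)."""
    return [[1.0, 0.0, 0.0, x], [0.0, 1.0, 0.0, y],
            [0.0, 0.0, 1.0, z], [0.0, 0.0, 0.0, 1.0]]

def matmul(a, b):
    """Product of two 4x4 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def apply(m, p):
    """Apply a 4x4 transform to a homogeneous column 4-vector."""
    return [sum(m[i][k] * p[k] for k in range(4)) for i in range(4)]

def rigid_inverse(m):
    """Inverse of a rigid transform [R t; 0 1], which is [R^T  -R^T t; 0 1]."""
    rt = [[m[j][i] for j in range(3)] for i in range(3)]
    t = [m[i][3] for i in range(3)]
    inv_t = [-sum(rt[i][k] * t[k] for k in range(3)) for i in range(3)]
    return [rt[i] + [inv_t[i]] for i in range(3)] + [[0.0, 0.0, 0.0, 1.0]]

# Rotation is done before translation, so the translation is the left factor.
B = matmul(trans(100.0, 0.0, 0.0), rot_z(90.0))
x_lab = [0.0, 50.0, 0.0, 1.0]

moved = apply(B, x_lab)                   # alibi: x rotated and translated by B
x_in_B = apply(rigid_inverse(B), x_lab)   # alias: the same LAB point expressed in frame B
```

Here frame B has its origin at (100,0,0) in LAB with its X axis along LAB Y, so the LAB point (0,50,0) has coordinates (50,100,0) in B, while moving the point by B puts it at (50,0,0) in LAB.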

2.1.  Scalars 

VAL  expresses  distance  in  mm,  angles  in  degrees.  Parameters  to  our  subroutines 
are  expressed  in  radians,  to  save  conversions.  Since  users  often  prefer  degrees,  our  user 
interfaces  (and  this  document)  usually  express  angles  in  degrees. 

f           Effective imaging system focal length (including digitizing effects).
            (Note that this number has nothing to do with the physical focal length of
            the lens.)

s           Camera aspect ratio: column spacing / row spacing.

$\phi$      Camera platform altitude angle in radians.

$\theta$    Generic azimuth angle.

$\theta_L$  (Head’s) Left Camera azimuth angle.

$\theta_R$  Right Camera azimuth angle.

2.2.  Vectors 

Pixel  coordinates  in  MaxVideo  routines  are  expressed  as  (pixel  offset  in  scanline, 
scanline),  which  corresponds  to  the  "physical"  (x,y)  coordinate  scheme  used  in  the 
imaging  model  coordinate  system,  which  has  Y  axis  down  and  X  axis  to  the  right.  In 
array  indexing,  however,  the  "natural"  element-addressing  scheme  is 
Image[row][column],  which  has  semantics  (y,x).  Where  this  document  uses  pixel 
coordinates,  they  are  expressed  in  the  physical  system  rather  than  in  the  array-indexing 
system. 

$\mathbf{x}$  Generic vector $(x,y,z)^T$, a point in space expressed in some coordinate
         system. Often the homogeneous column 4-vector $(x,y,z,w)^T$.

(x,y)    Image (pixel) coordinates of a point.

(O,A,T)  "Orientation, Altitude, and Twist" angles describing the orientation of
         TOOL axes in terms of LAB. Like Euler angles but not (see below).

Loc      (X,Y,Z,O,A,T). A generic location (X,Y,Z position and orientation) used
         by VAL. May be relative. In this document we usually construe Loc to
         define the location of the TOOL CS in terms of LAB.

Head     ($\phi$, $\theta_L$, $\theta_R$). Camera rotations defining the head configuration.

JntAng   Six angles defining the rotations of the robot links. Used by VAL software
         as "precision points".


2.3.  Matrices,  Coordinate  Systems,  and  Transforms 


$S_{ij}$      The i-th row, j-th column element of S.

$A_i$         An "A matrix", expressing the rigid transform induced by one link in a
              kinematic chain.

$T_j$         A transform induced by a kinematic chain. By definition,
              $T_j = A_1 A_2 \cdots A_j$.

Rot_x(a)      Rotation around X axis by angle a.

Rot_y(a)      Similar.

Rot_z(a)      Similar.

Trans(x,y,z)  Translate by x,y,z.

LAB           CS attached to the laboratory, defined below.

TOOL          A user-definable coordinate system rigidly attached to joint 6 of the Puma.

NULLTOOL      VAL’s default value for TOOL. It corresponds to a relative location of
              (X,Y,Z,O,A,T) = (0,0,0,90,-90,0).

FLANGE        CS convenient for head and camera calculations. Like another TOOL CS,
              it is rigidly attached to joint 6. Defined in terms of NULLTOOL or
              relative to $T_6$, where it has a Location of (0,0,0,-180,0,-90) (see below).

C             A perspective camera transform, not a rigid CS transform.

CamPos        FLANGE transformed so its Z axis points along a camera’s optic axis and
              its origin is at the front principal point of the lens. CamPosL and
              CamPosR are for left and right cameras.

PhysPixel     Translates pixel coordinates so origin is upper left corner of pixel array,
              not its middle.


3.  Coordinate  Systems  and  Robot  Coordinates 
3.1.  LAB 

The  Puma’s  internal  representations  assume  that  its  first  link  is  rigidly  attached  to  a 
LAB  coordinate  system.  VAL  generally  reports  locations  in  LAB  coordinates.  The 
Puma’s BASE coordinate system is usually synonymous with LAB. BASE may be
changed by invoking the VAL BASE command, which allows translation of the X,Y,Z
origin and Z-rotation of BASE. An automatic Z rotation may be a good idea, since the
Puma is bolted slightly askew.

At initialization, according to the manual, the origin of LAB is at the intersection of
Joints 1 and 2. Certainly the origin is somewhere near the centerline of the Puma.
Imagine you are looking in at the Puma through the window. Then in the default state,
the LAB X axis points to your right parallel to the window, the Y axis is pointing away
from you toward the far wall, and Z is up (Fig. 1). If all joint angles are 0 (JntAng = 0),
TOOL  =  NULLTOOL,  and  Head  =  (0,0,0)  =  0,  the  cameras  are  pointing  away  from  you 
down  TOOL  (and  LAB)  Y. 


3.2. T6 and TOOL

$T_6$ is a CS attached to the end of the last link of the robot. The TOOL CS is
defined relative to $T_6$. $T_6$, TOOL, and FLANGE share a common origin. $T_6$ is only
useful to understand the TOOL coordinate system. When JntAng = 0 (Fig. 1), $T_6$ has its
X axis pointing down (along LAB -Z), its Y axis pointing along LAB X, and its Z axis
along LAB -Y.

The TOOL coordinate system is a transformation of $T_6$, and is a primitive notion in
VAL commands, which can often be expressed in TOOL coordinates. Upon
initialization, TOOL = NULLTOOL, which (Fig. 1) makes TOOL simply a translation of
LAB if JntAng = 0. VAL reports the location of the TOOL CS in the form Loc =
(X,Y,Z,O,A,T). NULLTOOL corresponds to a relative location, with respect to $T_6$, of
(0,0,0,90,-90,0). The O,A,T components of Loc are angles that have the following
semantics: "Rotate by -O around (the current) X, then by A around the new Y, then by T
around the new Z". Thus they have the same flavor as Euler angles. It is easy to verify
that applying the NULLTOOL transform to $T_6$ at JntAng = 0 transforms $T_6$ to have axes
parallel to LAB.

Figure 1: The Puma and head, as seen from the observation window, showing three basic
CSs. The origins of $T_6$, NULLTOOL, and FLANGE coincide. The robot is shown with
JntAng = 0 and Head = 0. In this configuration Loc = (650,190,975,90,-90,0).

One  interesting  and  useful  aspect  of  TOOL  is  that  it  can  be  redefined  by  the  user. 
For  instance,  one  can  redefine  tool  as  a  remote  point,  such  as  a  world  point  that  is 
currently  in  view.  Then  it  is  possible  to  issue  VAL  commands  that  rotate  the  robot  head 
around the TOOL origin. The effect is for the head to move in space and to be
continuously reoriented by the robot wrist (not the camera motors) so that the cameras
remain pointed at the same three-dimensional scene point.

3.3.  FLANGE 

The head is rigidly attached to the sixth robot link, and hence to $T_6$ and TOOL.
When the eyes are facing "forward" (Head = 0), FLANGE is a coordinate system whose
axes are oriented to be consistent with the camera imaging model (Fig. 1). In FLANGE,
Z is out along the direction the head is facing (parallel to the optic axis of cameras if
Head = 0). Y is down, increasing with the row number addresses of pixels in an image,
and X is "to the right", increasing with the column numbers of pixel addresses in the
image. One common trick is to define TOOL as FLANGE. This renders the explicit
FLANGE transform unnecessary. If the TOOL transform is set to (X,Y,Z,O,A,T) =
(0,0,0,-180,0,-90), then $T_6$ is transformed to FLANGE by the TOOL transform within
the Puma.

4.  Camera  Imaging  Model 

Here  we  are  concerned  with  the  "intrinsic"  camera  parameters  [Tsai  1986],  which 
govern  its  optical  properties.  "Extrinsic"  properties  define  its  location  in  space,  and 
determine the CamPos coordinate system. These properties are determined by the
kinematic  issues  discussed  below.  For  intrinsic  camera  properties  we  use  a  pinhole 
model  (e.g.  [Duda  and  Hart  1973]),  which  is  to  say  we  do  not  correct  for  radial  lens 
distortions.  This  is  not  a  policy,  it  is  just  that  we  have  not  yet  been  motivated  to  do  so. 

The camera optic axis is out along the positive Z axis. Looking out along the
camera’s line of sight, Y points down, increasing as does the scan-line number in the
camera’s image. X points to the right, increasing as the pixel number along a scan line.
X,Y,Z form a right-handed coordinate system. We assume the origin of coordinates is at
the camera’s front principal point, and that the image is formed at a distance f in front of
the origin in the X-Y plane by point projection. Then a scene point $\mathbf{x} = (x,y,z)^T$ yields
the image point coordinates

$$ (x_{im}, y_{im}) = \left( \frac{f x}{z}, \; \frac{f s y}{z} \right), $$

where f is the effective focal length of the entire imaging, transmission, digitization, and
ROIStoring process, and s is a scaling constant that expresses the "aspect ratio" of the
system. The angular (spatial) resolution of the final pixels resting in ROI is less in the Y
direction, and s tells by how much. Thus the model tells where a point appears (under
default settings) in ROI store, and under default settings where its location is reported by
FeatureMax. It includes all effects induced by CCD chip layout, conversion to analog
waveform by the Panasonic electronics, sampling by DigiMax, and storage in ROI. It
does not pretend to say anything about any of these effects in isolation.

The parameters f and s were estimated using a calibration chart, and the values in
the  "Constants"  section  represent  our  best  current  estimates. 

The camera transform can be expressed as multiplying the homogeneous scene
point vector by a transform matrix C and then performing normalization [Tsai 86]. The
normalization operation scales a homogeneous 4-vector $(x,y,z,w)^T$ by (1/w). In this
context, the resulting value of z is an artifact, since the image has only two dimensions.
The matrix C is

$$ C = \begin{bmatrix} f & 0 & 0 & 0 \\ 0 & fs & 0 & 0 \\ 0 & 0 & f & 0 \\ 0 & 0 & 1 & 0 \end{bmatrix}. $$

The  camera  extrinsic  properties  are  determined  by  the  LAB-Camera  kinematic 
chain  discussed  next. 
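
The pinhole model with its normalization step can be sketched in a few lines. This is an illustrative Python sketch rather than lab code; the function name project is an assumption, and the default values of f and s are the CAM_F and CAM_S estimates listed under Camera Constants. The point is assumed to be already expressed in the camera (CamPos) coordinate system.

```python
CAM_F = 980.5   # effective focal length, from the Camera Constants
CAM_S = 1.289   # aspect ratio, from the Camera Constants

def project(point, f=CAM_F, s=CAM_S):
    """Image coordinates of a scene point (x, y, z) given in camera coordinates.

    Multiplies the homogeneous point by the matrix C and normalizes by w.
    """
    x, y, z = point
    C = [[f, 0.0, 0.0, 0.0],
         [0.0, f * s, 0.0, 0.0],
         [0.0, 0.0, f, 0.0],
         [0.0, 0.0, 1.0, 0.0]]
    h = [sum(C[i][k] * v for k, v in enumerate((x, y, z, 1.0))) for i in range(4)]
    w = h[3]                      # equals z for this C
    if w <= 0.0:
        raise ValueError("point must lie in front of the camera (z > 0)")
    # After normalization the third component is always f, the artifact noted above.
    return h[0] / w, h[1] / w

u, v = project((100.0, 50.0, 1000.0))
```

The returned coordinates are relative to the image center; the PhysPixel shift to the upper left corner is not applied here.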

5.  The  LAB-Camera  Kinematic  Chain 

There are some fourteen identifiable transforms between LAB and a CamPos
coordinate system (Table 1). The head transforms can be collapsed into two link
transforms involving offsets and one rotation each [Paul 81], but in this treatment all the
transforms beyond $A_8$ are pure rotations or translations. The Joint 1-6 transforms $A_1$-$A_6$
generally involve both offsets and rotations.

Define $T_i$ as $A_1 A_2 \cdots A_i$. These transforms, their partial products, and the inverses of
their partial products, are of use in everyday robotic life. For instance, $T_8$ converts points
expressed in FLANGE coordinates (often the output of vision routines is in FLANGE)
into LAB. As another example, to simulate making an image with a camera, a point in
LAB must be transformed by $T_{14}^{-1}$ in order for the camera imaging model to apply.

Note that $T_7$ is implemented internally in VAL. We can only ask or set the value of
$T_7$ (not $A_1$-$A_7$). $A_8$-$A_{14}$ are transforms that are created and manipulated by the
user. Thus we can describe $A_7$ and $A_8$ as follows.

$A_7$   NULLTOOL or set by user.

$A_8$   Rot_x(-90) if TOOL = NULLTOOL, Identity if TOOL = FLANGE.

There are two CamAxis transforms, corresponding to the offsets of left and right
cameras along the camera platform. CamAxisL and CamAxisR are given special names only
because they are the last head points that are rigidly affixed to FLANGE. Thus they may
offer some efficiency for position evaluations when the eyes are moving but the head
(robot) is not. From this point on there are two kinematic chains, corresponding to the
differing offsets and azimuth angles of the two cameras, and denoted by L and R. We
often group $A_9 \cdots A_{14}$ into a single transform matrix (named CamPosL or CamPosR)
expressing the camera location in FLANGE coordinates.

A matrix   Name           Const. or Var.   Resulting CS
--------   ------------   --------------   -------------
Ident      LAB            C                LAB
A1         Joint 1        V
A2         Joint 2        V
A3         Joint 3        V
A4         Joint 4        V
A5         Joint 5        V
A6         Joint 6        V                T6
A7         Tool           V                TOOL
A8         FLANGE         C                FLANGE
A9         Neck Offset    C
A10        Eye X Offset   CL, CR           CamAxis (L,R)
A11        Altitude       V
A12        Alt. Offset    C
A13        Azimuth        VL, VR
A14        Az. Offset     C                CamPos (L,R)

Table 1: The LAB to camera kinematic chain of transforms.

The definitions of distances and angles for the robot head are shown in Fig. 2. See
the  section  on  Constants  for  numeric  values. 

6. Forward Kinematics: CamPos from ($\phi$, $\theta$)

The camera motor control software positions a camera at a specific altitude, or pitch
($\phi$), and azimuth, or yaw ($\theta$). We should like to compute CamPos from altitude and
azimuth. CamPos is expressed in FLANGE coordinates.

$$ \mathrm{CamPos} = A_9 A_{10} \cdots A_{14}. $$

The values of the relevant A matrices are as follows (Fig. 2). Constant values are given
in Section 13.

$A_9$      Trans(0,NECK_OFFSET,0).

$A_{10}$   Trans(LEFT_OFFSET,0,0) or Trans(RIGHT_OFFSET,0,0).

$A_{11}$   Rot_x($\phi$).

$A_{12}$   Trans(0,ALT_OFFSET,0).

$A_{13}$   Rot_y($\theta_L$) or Rot_y($\theta_R$).

$A_{14}$   Trans(0,0,AZ_OFFSET).
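
The chain product can be written out directly. The following is a minimal Python sketch, assuming the column-vector convention of Section 2 and angles given in degrees; the helper names (rot_x, rot_y, trans, matmul, cam_pos) are illustrative and are not the lab library's API. The offsets are the Head Constants of Section 13.

```python
import math
from functools import reduce

# Head Constants from Section 13.1.
NECK_OFFSET, ALT_OFFSET, AZ_OFFSET = -149.2, -65.1, 34.9
LEFT_OFFSET, RIGHT_OFFSET = -12.7, 152.4

def rot_x(deg):
    c, s = math.cos(math.radians(deg)), math.sin(math.radians(deg))
    return [[1.0, 0.0, 0.0, 0.0], [0.0, c, -s, 0.0],
            [0.0, s, c, 0.0], [0.0, 0.0, 0.0, 1.0]]

def rot_y(deg):
    c, s = math.cos(math.radians(deg)), math.sin(math.radians(deg))
    return [[c, 0.0, s, 0.0], [0.0, 1.0, 0.0, 0.0],
            [-s, 0.0, c, 0.0], [0.0, 0.0, 0.0, 1.0]]

def trans(x, y, z):
    return [[1.0, 0.0, 0.0, x], [0.0, 1.0, 0.0, y],
            [0.0, 0.0, 1.0, z], [0.0, 0.0, 0.0, 1.0]]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def cam_pos(phi_deg, theta_deg, eye_offset):
    """CamPos = A9 A10 A11 A12 A13 A14, expressed in FLANGE coordinates."""
    chain = [
        trans(0.0, NECK_OFFSET, 0.0),   # A9:  neck offset
        trans(eye_offset, 0.0, 0.0),    # A10: eye X offset (left or right)
        rot_x(phi_deg),                 # A11: altitude
        trans(0.0, ALT_OFFSET, 0.0),    # A12: altitude offset
        rot_y(theta_deg),               # A13: azimuth
        trans(0.0, 0.0, AZ_OFFSET),     # A14: azimuth offset
    ]
    return reduce(matmul, chain)

cam_pos_l = cam_pos(0.0, 0.0, LEFT_OFFSET)    # left camera, Head = 0
cam_pos_r = cam_pos(0.0, 0.0, RIGHT_OFFSET)   # right camera, Head = 0
```

With Head = 0 the rotation part is the identity and the camera origin is simply the sum of the offsets, which gives a quick sanity check on the constants.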

7.  Forward  Kinematics:  TOOL  from  Loc 

VAL provides several useful conversions. A "precision point" is a JntAng vector,
and VAL understands robot locations in both precision points and locations (Loc
vectors). It of course is possible to derive (O,A,T) simply by composing all the joint
angle rotations. Deriving X,Y,Z from the joint angles involves knowing offsets and
doing a full forward kinematics solution. Generally then the TOOL location is most
easily obtained from VAL. VAL reports the TOOL location as a JntAng vector (a VAL
"precision point") and a Loc vector (a VAL "location"). Our current Purdue robot
control software only returns a Loc, although that is subject to change.

Figure 2: Axes and link offsets in the robot head (see Section 13).

Converting the (O,A,T) angles to a transform involves knowing exactly what they
mean. The Puma manual is not explicit here. Ray Rimey determined the following
transformation, which, in its alibi aspect, moves LAB to the current TOOL CS.

TOOL(Loc) = Rot_z(-90) Rot_y(90) Rot_x(-O) Rot_y(A) Rot_z(T) Trans(X,Y,Z).

This transformation, in its alias aspect, thus converts points from TOOL to LAB
coordinates. It is written out explicitly below. If the TOOL transform is redefined by the
user, then it is to that redefined CS that LAB will be transformed. Redefining TOOL
with a VAL command means setting the values of X,Y,Z,O,A,T in the above
transformation. Thus if TOOL is redefined within VAL from NULLTOOL to FLANGE,
then $A_8$ should be the identity transform, and can vanish from the kinematic chain and
from the user’s external calculations.
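
The rotational part of this formula can be checked against two facts stated earlier: the NULLTOOL angles (90,-90,0) give axes parallel to LAB, and the FLANGE angles (-180,0,-90) give the same orientation as NULLTOOL followed by Rot_x(-90). The sketch below is illustrative Python with assumed helper names; it covers only the five rotations and leaves out Trans(X,Y,Z).

```python
import math

def rot(axis, deg):
    """4x4 homogeneous rotation about 'x', 'y' or 'z' (angle in degrees)."""
    c, s = math.cos(math.radians(deg)), math.sin(math.radians(deg))
    if axis == "x":
        r = [[1.0, 0.0, 0.0], [0.0, c, -s], [0.0, s, c]]
    elif axis == "y":
        r = [[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]]
    else:
        r = [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]
    return [row + [0.0] for row in r] + [[0.0, 0.0, 0.0, 1.0]]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def tool_rotation(o, a, t):
    """Rot_z(-90) Rot_y(90) Rot_x(-O) Rot_y(A) Rot_z(T), translation omitted."""
    m = rot("z", -90.0)
    for step in (rot("y", 90.0), rot("x", -o), rot("y", a), rot("z", t)):
        m = matmul(m, step)
    return m

nulltool = tool_rotation(90.0, -90.0, 0.0)    # axes parallel to LAB
flange = tool_rotation(-180.0, 0.0, -90.0)    # X right, Y down, Z forward
via_a8 = matmul(nulltool, rot("x", -90.0))    # NULLTOOL followed by A8
```

Within rounding error, nulltool is the identity and flange equals via_a8, with its Y axis along LAB -Z and its Z axis along LAB Y.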

8. Inverse Kinematics: ($\phi$, $\theta$) from CamPos

Given a camera position expressed as a CamPos transform in FLANGE coordinates,
what $\theta$ and $\phi$ angles created it? To find out, write the transform

$$ E = A_9 A_{10} \cdots A_{14} $$

and notice that certain individual elements of E contain exactly the sines and cosines of $\phi$
and $\theta$. We see

$$ \phi = \mathrm{atan\_2}(E_{21}, E_{11}), $$

$$ \theta = \mathrm{atan\_2}(E_{02}, E_{00}). $$

This is a very simple version of the work needed to get O,A,T from TOOL.
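
A round trip makes the element indices concrete. In the sketch below (illustrative Python, helper names assumed, indices counted from zero as in the formulas) a CamPos matrix is built from known angles with the chain of Section 6 and the angles are then read back from its elements.

```python
import math
from functools import reduce

NECK_OFFSET, ALT_OFFSET, AZ_OFFSET, RIGHT_OFFSET = -149.2, -65.1, 34.9, 152.4

def rot_x(deg):
    c, s = math.cos(math.radians(deg)), math.sin(math.radians(deg))
    return [[1.0, 0.0, 0.0, 0.0], [0.0, c, -s, 0.0],
            [0.0, s, c, 0.0], [0.0, 0.0, 0.0, 1.0]]

def rot_y(deg):
    c, s = math.cos(math.radians(deg)), math.sin(math.radians(deg))
    return [[c, 0.0, s, 0.0], [0.0, 1.0, 0.0, 0.0],
            [-s, 0.0, c, 0.0], [0.0, 0.0, 0.0, 1.0]]

def trans(x, y, z):
    return [[1.0, 0.0, 0.0, x], [0.0, 1.0, 0.0, y],
            [0.0, 0.0, 1.0, z], [0.0, 0.0, 0.0, 1.0]]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def cam_pos(phi_deg, theta_deg, eye_offset):
    """E = A9 A10 ... A14 for one camera."""
    return reduce(matmul, [trans(0.0, NECK_OFFSET, 0.0), trans(eye_offset, 0.0, 0.0),
                           rot_x(phi_deg), trans(0.0, ALT_OFFSET, 0.0),
                           rot_y(theta_deg), trans(0.0, 0.0, AZ_OFFSET)])

def head_angles(e):
    """Recover (phi, theta) in degrees from the elements of E."""
    phi = math.degrees(math.atan2(e[2][1], e[1][1]))     # E21 = sin(phi), E11 = cos(phi)
    theta = math.degrees(math.atan2(e[0][2], e[0][0]))   # E02 = sin(theta), E00 = cos(theta)
    return phi, theta

phi_out, theta_out = head_angles(cam_pos(-12.0, 25.0, RIGHT_OFFSET))
```

Only the rotation block of E is used, so the result does not depend on which eye offset was chosen.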

9. Inverse Kinematics: O,A,T from TOOL

The FRAME command in VAL takes four input vectors that describe the TOOL
axis unit vectors and origin, and returns the corresponding Loc vector (X,Y,Z,O,A,T).
The interesting part of this is of course deriving (O,A,T) from the TOOL CS. In turn,
this  is  an  operation  quite  closely  related  to  deriving  Euler  angles  from  a  transform,  which 
is  an  early  exercise  in  [Paul  81]. 

The approach is to multiply the five matrices from the TOOL(Loc) formula (leaving
out the translation) together to get a TOOL transform D. Say

$$ D = B_1 B_2 B_3 B_4 B_5. $$

Then, postmultiply both sides by $B_5^{-1}$, and look for interesting relationships elementwise
between the two matrices. In this case, as in Paul’s solution for Euler angles, we find that
we have enough information to compute O,A,T in the form of atan_2() functions, which
have good properties. Proceeding to details, use the notation $S_A$ for sin(A), etc.,
substitute $A_i$ for $B_i$ in the kinematic chain equation above, and write the resulting
product of the first four B matrices as

$$ B_1 B_2 B_3 B_4 = \begin{bmatrix} -S_O S_A & C_O & S_O C_A & 0 \\ C_O S_A & S_O & -C_O C_A & 0 \\ -C_A & 0 & -S_A & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}. $$

The complete transformation is

$$ D = \begin{bmatrix} C_O S_T - S_O S_A C_T & C_O C_T + S_O S_A S_T & S_O C_A & 0 \\ S_O S_T + C_O S_A C_T & S_O C_T - C_O S_A S_T & -C_O C_A & 0 \\ -C_A C_T & C_A S_T & -S_A & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}. $$

Section 7 establishes that $B_1 B_2 B_3 B_4 = D B_5^{-1} = D\,\mathrm{Rot\_z}(-T)$. Performing the formal
multiplication on the right hand side yields a 4x4 matrix whose elements are functions of
the elements of D (which we know), $C_T$, and $S_T$. Equating these elements to those of
$B_1 B_2 B_3 B_4$, we find our first interesting equation:

$$ S_T D_{20} + C_T D_{21} = 0, $$

which implies

$$ T = \mathrm{atan\_2}(D_{21}, -D_{20}). $$

We can also read off that

$$ -S_A = D_{22} $$

and

$$ -C_A = C_T D_{20} - S_T D_{21}, $$

so

$$ A = \mathrm{atan\_2}(-D_{22}, \; S_T D_{21} - C_T D_{20}). $$

Finally, we can read off expressions for $C_O$ and $S_O$ to get

$$ O = \mathrm{atan\_2}(S_T D_{10} + C_T D_{11}, \; S_T D_{00} + C_T D_{01}). $$

At  the  time  of  writing,  these  derivations  have  not  been  checked  against  the  output  of 
the  FRAME  command. 
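
Pending a check against FRAME, the formulas can at least be tested for self-consistency by a round trip through the rotation product of Section 7. The sketch below is illustrative Python with assumed helper names. It uses the branch T = atan_2(D21, -D20), which corresponds to cos(A) > 0, and it makes no attempt to handle A = +90 or -90 degrees (the NULLTOOL case), where D20 and D21 both vanish and T cannot be separated from O.

```python
import math

def rot(axis, deg):
    """4x4 homogeneous rotation about 'x', 'y' or 'z' (angle in degrees)."""
    c, s = math.cos(math.radians(deg)), math.sin(math.radians(deg))
    if axis == "x":
        r = [[1.0, 0.0, 0.0], [0.0, c, -s], [0.0, s, c]]
    elif axis == "y":
        r = [[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]]
    else:
        r = [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]
    return [row + [0.0] for row in r] + [[0.0, 0.0, 0.0, 1.0]]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def tool_rotation(o, a, t):
    """D = Rot_z(-90) Rot_y(90) Rot_x(-O) Rot_y(A) Rot_z(T)."""
    m = rot("z", -90.0)
    for step in (rot("y", 90.0), rot("x", -o), rot("y", a), rot("z", t)):
        m = matmul(m, step)
    return m

def oat_from_rotation(d):
    """Recover (O, A, T) in degrees from D, assuming cos(A) > 0."""
    t = math.atan2(d[2][1], -d[2][0])
    st, ct = math.sin(t), math.cos(t)
    a = math.atan2(-d[2][2], st * d[2][1] - ct * d[2][0])
    o = math.atan2(st * d[1][0] + ct * d[1][1], st * d[0][0] + ct * d[0][1])
    return tuple(math.degrees(v) for v in (o, a, t))

o_out, a_out, t_out = oat_from_rotation(tool_rotation(30.0, 20.0, -40.0))
```

A production version would need an explicit policy for the degenerate case, for example fixing T = 0 and solving for O alone.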


10. Inverse Kinematics: ($\phi$, $\theta$) from (x,y,z)

Given a point $\mathbf{x}$ at $(x,y,z,1)^T$ in FLANGE, which $\phi$ and $\theta$ parameters will center the
point in a camera’s view? Following the strategy of the last section did not immediately
lead to a promising set of equations. The hope was that the camera physical transform E
could be written out and the fact that $E\mathbf{x} = (0,0,z,1)^T$ would lead to something simple. It
did not seem to. Instead we use a straightforward geometric approach (Fig. 3).

From Fig. 3, we have

$$ h = (z^2 + y^2)^{1/2}, $$
$$ b = \mathrm{asin}(d/h), $$
$$ \phi = \mathrm{atan\_2}(-y, z) - b. $$

The asin() is bad practice because of its ambiguity and lack of differentiation for angles
near 90 degrees, but for small angles, as will usually be the case in such a setup as this, it
behaves reasonably.

It remains to determine the azimuthal rotation. Rotate space by Rot_x($-\phi$), bringing
the cameras and point into a plane of constant y. The point’s new (x,z) coordinates
become $(x, \; z\cos\phi - y\sin\phi)$, and finally we have

$$ \theta = \mathrm{atan\_2}(x, \; z\cos\phi - y\sin\phi). $$


Figure 3: (a) Distances and angles for computing the $\phi$ to aim camera at a point.
(b) Distances to compute $\theta$ to aim camera at a point.
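
The geometric solution translates almost line for line into code. The following Python sketch is illustrative only: the function name aim_angles is an assumption, and the distance d is taken as a caller-supplied parameter because its value is defined by Fig. 3 rather than in the text. Angles are returned in radians.

```python
import math

def aim_angles(x, y, z, d):
    """Altitude phi and azimuth theta (radians) that center the point (x, y, z).

    d is the offset distance labeled d in Fig. 3 (supplied by the caller).
    """
    h = math.hypot(z, y)
    if h == 0.0 or abs(d) > h:
        raise ValueError("point is too close to the pitch axis for asin(d/h)")
    b = math.asin(d / h)
    phi = math.atan2(-y, z) - b
    # After rotating space by Rot_x(-phi), the point's z becomes z cos(phi) - y sin(phi).
    theta = math.atan2(x, z * math.cos(phi) - y * math.sin(phi))
    return phi, theta

straight = aim_angles(0.0, 0.0, 1000.0, 0.0)      # point dead ahead, no offset
raised = aim_angles(100.0, -100.0, 100.0, 0.0)    # point above (Y is down) and to the right
offset = aim_angles(0.0, 0.0, 1000.0, 50.0)       # nonzero d tilts the camera by -asin(d/h)
```

The explicit domain check replaces a silent failure of asin() when the point lies closer to the pitch axis than d.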

11. Inverse Optics: (x,y,z) from Two Images via Pseudoinverse

The complete imaging model is

$$ (x_p, y_p, z_p, 1)^T = P(\mathrm{norm}(C T \mathbf{x})), $$

where P is the PhysPixel transform that shifts the origin of pixel coordinates to the upper
left corner from the center of the image, norm() is the homogeneous vector normalizing
operation, C is the imaging matrix given above in the Camera Model section, and T is the
inverse of the transform that locates the camera in LAB coordinates, i.e. it is
$(\mathrm{FLANGE} \; \mathrm{CamPos})^{-1}$. First, let

$$ (\hat{x}, \hat{y}, \hat{z}, 1)^T = P^{-1} (x_p, y_p, z_p, 1)^T. $$

(In our system this just amounts to defining the new variables $\hat{x} = x_p - 255$, $\hat{y} = y_p - 255$.)
Then from the definition of norm(), and letting $\mathbf{x} = (x,y,z,1)^T$, we have

$$ (\hat{x}, \hat{y}) = \left( \frac{[CT]_0 \mathbf{x}}{[CT]_3 \mathbf{x}}, \; \frac{[CT]_1 \mathbf{x}}{[CT]_3 \mathbf{x}} \right), $$

where $[CT]_0$ is the first row of CT, etc. Moving the denominators over to the left gives
us

$$ \hat{x} [CT]_3 \mathbf{x} = [CT]_0 \mathbf{x}, $$

$$ \hat{y} [CT]_3 \mathbf{x} = [CT]_1 \mathbf{x}, $$

and multiplying everything out and rearranging gives two linear equations in x, y, and z in
terms of the known quantities f the effective focal length, s the aspect ratio, and $T_{ij}$:

$$ x(\hat{x} T_{20} - f T_{00}) + y(\hat{x} T_{21} - f T_{01}) + z(\hat{x} T_{22} - f T_{02}) = f T_{03} - \hat{x} T_{23}, $$

$$ x(\hat{y} T_{20} - f s T_{10}) + y(\hat{y} T_{21} - f s T_{11}) + z(\hat{y} T_{22} - f s T_{12}) = f s T_{13} - \hat{y} T_{23}. $$

Thus knowing the physical locations of two cameras, and knowing the pixel coordinates
of the corresponding two images of the same three-dimensional point, yields four
equations in the three unknowns (x,y,z). They can be solved by a pseudo-inverse
method. If X is the matrix of coefficients of (x,y,z) in the above equations and Y is the
column matrix of the right hand sides, then the four equations can be written

$$ Y = XB $$

if B is the formal column-vector of the variables $(x,y,z)^T$. The values of x, y, and z are
obtained simply by computing the pseudo inverse of X:

$$ B = (X^T X)^{-1} X^T Y. $$

The physical interpretation of this method is made difficult by the fact that the
"observables" (the $\hat{x}$ and $\hat{y}$) and the "independent variables" (the $T_{ij}$) contribute to
coefficients of both the X matrix and Y vector. Analysis shows that the effect of noise on
this method may be significant, since a one-pixel error in x position causes a 20 mm depth
error at a two meter distance, and a one degree error in azimuth produces a 20 mm error
in X location. The method has been implemented and integrated into a system that
obtains  three-dimensional  position  and  verifies  it  by  touching  the  object  with  a  pointer, 
and  seems  to  perform  as  well  as  the  more  geometrically  intuitive  method  given  in  the 
next  section.  One  potential  advantage  of  the  pseudoinverse  method  is  its  straightforward 
extension  to  more  data  points. 
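
The pseudoinverse computation can be sketched on synthetic data. The Python below is a hedged illustration, not the implemented system: the function names, the 165 mm baseline, and the test point are assumptions, and the normal equations are solved with a small Gaussian elimination so that only the standard library is needed. Each camera is described by its matrix T, the inverse of the transform that locates it, and contributes two rows to X and Y.

```python
CAM_F, CAM_S = 980.5, 1.289   # effective focal length and aspect ratio (Section 13)

def project(T, p, f=CAM_F, s=CAM_S):
    """Centered pixel coordinates of point p seen by the camera whose inverse pose is T."""
    c = [sum(T[i][j] * p[j] for j in range(3)) + T[i][3] for i in range(3)]
    return f * c[0] / c[2], f * s * c[1] / c[2]

def camera_rows(T, xh, yh, f=CAM_F, s=CAM_S):
    """The two linear equations in (x, y, z) contributed by one camera."""
    rows = [[xh * T[2][j] - f * T[0][j] for j in range(3)],
            [yh * T[2][j] - f * s * T[1][j] for j in range(3)]]
    rhs = [f * T[0][3] - xh * T[2][3], f * s * T[1][3] - yh * T[2][3]]
    return rows, rhs

def solve3(m, v):
    """Solve a 3x3 linear system by Gaussian elimination with partial pivoting."""
    a = [row[:] + [r] for row, r in zip(m, v)]
    for col in range(3):
        piv = max(range(col, 3), key=lambda r: abs(a[r][col]))
        a[col], a[piv] = a[piv], a[col]
        for r in range(col + 1, 3):
            k = a[r][col] / a[col][col]
            for c in range(col, 4):
                a[r][c] -= k * a[col][c]
    x = [0.0, 0.0, 0.0]
    for r in (2, 1, 0):
        x[r] = (a[r][3] - sum(a[r][c] * x[c] for c in range(r + 1, 3))) / a[r][r]
    return x

def triangulate(cameras, pixels):
    """B = (X^T X)^-1 X^T Y for any number of (T, pixel) observations."""
    X, Y = [], []
    for T, (xh, yh) in zip(cameras, pixels):
        rows, rhs = camera_rows(T, xh, yh)
        X += rows
        Y += rhs
    xtx = [[sum(r[i] * r[j] for r in X) for j in range(3)] for i in range(3)]
    xty = [sum(r[i] * y for r, y in zip(X, Y)) for i in range(3)]
    return solve3(xtx, xty)

# Synthetic check: left camera at the origin, right camera 165 mm along X.
T_left = [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0], [0.0, 0.0, 1.0, 0.0], [0.0, 0.0, 0.0, 1.0]]
T_right = [[1.0, 0.0, 0.0, -165.0], [0.0, 1.0, 0.0, 0.0], [0.0, 0.0, 1.0, 0.0], [0.0, 0.0, 0.0, 1.0]]
truth = [50.0, -30.0, 1500.0]
cams = [T_left, T_right]
estimate = triangulate(cams, [project(T, truth) for T in cams])
```

Because triangulate simply stacks rows, passing a third camera and pixel pair extends the solution to more data points without further changes.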
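
12. Inverse Optics: (x,y,z) from Two Images via Vectors

Duda and Hart [1973] present a geometrically intuitive method for stereo from two
image points. We have implemented it and find it works as well as the pseudoinverse
method for two images. The stereo problem is posed using plain 3-vectors, not
homogeneous vectors. The following vectors are defined (Fig. 4).

The two cameras have lens centers at $L_L$ and $L_R$. The vector $\boldsymbol{\delta}$ points from $L_L$ to
$L_R$. The 3-D point in the scene is $\mathbf{x}$. $\mathbf{x}$ is imaged in cameraL as $\mathbf{x}_L = (x_L, y_L)^T$, and in
cameraR it is imaged into $\mathbf{x}_R$. The vector $\mathbf{v}_L$ points from the lens center of cameraL
through the point $\mathbf{x}_L$ and to the 3-D point in the scene. A unit vector in the same
direction is $\mathbf{u}_L$. Similarly, cameraR has $\mathbf{v}_R$ and $\mathbf{u}_R$ defined.

Figure 4: Vectors Used in Two-Vector Stereo Formulation.

A temporary world coordinate system is placed at $L_L$, thus $L_L = 0$. (Once the 3-D point
is estimated in this coordinate system, we will generally convert it to another system such
as FLANGE.)

The vectors $\mathbf{v}_L$ and $\mathbf{v}_R$ are defined as follows:

$$ \mathbf{v}_L = a_L \mathbf{u}_L, $$

$$ \mathbf{v}_R = a_R \mathbf{u}_R + \boldsymbol{\delta}. $$

The approach is to estimate the two scales $a_L$ and $a_R$ such that the $\mathbf{v}_L$ and $\mathbf{v}_R$ vectors are
as close together as possible. Note that $\boldsymbol{\delta}$ is known and that $\mathbf{u}_L$ and $\mathbf{u}_R$ can be computed
from $\mathbf{x}_L$ and $\mathbf{x}_R$. Then the value of $\mathbf{x}$ can be computed from $a_L$ and $a_R$ as the point
midway between the heads of the $\mathbf{v}_L$ and $\mathbf{v}_R$ vectors:

$$ \mathbf{x} = \tfrac{1}{2} \left( a_L \mathbf{u}_L + a_R \mathbf{u}_R + \boldsymbol{\delta} \right). $$

The values of $a_L$ and $a_R$ are estimated by minimizing

$$ \left\| a_L \mathbf{u}_L - (a_R \mathbf{u}_R + \boldsymbol{\delta}) \right\|^2. $$

Duda and Hart give the answer to the minimization as

$$ a_L = \frac{\mathbf{u}_L \cdot \boldsymbol{\delta} - (\mathbf{u}_L \cdot \mathbf{u}_R)(\mathbf{u}_R \cdot \boldsymbol{\delta})}{1 - (\mathbf{u}_L \cdot \mathbf{u}_R)^2}, $$

$$ a_R = \frac{(\mathbf{u}_L \cdot \mathbf{u}_R)(\mathbf{u}_L \cdot \boldsymbol{\delta}) - \mathbf{u}_R \cdot \boldsymbol{\delta}}{1 - (\mathbf{u}_L \cdot \mathbf{u}_R)^2}. $$

A few ancillary equations are

$$ \mathbf{u}_L = \frac{(x_L, \; y_L/s, \; f)^T}{\left\| (x_L, \; y_L/s, \; f)^T \right\|}, $$

$$ \mathbf{u}_R = \frac{(x_R, \; y_R/s, \; f)^T}{\left\| (x_R, \; y_R/s, \; f)^T \right\|}, $$

where s is the pixel aspect ratio. Given the positions of the Left and Right cameras in
FLANGE coordinates (call these CamPos transformations CamPosL and CamPosR), $\boldsymbol{\delta}$
may be computed as

$$ \boldsymbol{\delta} = L_R = \mathrm{CamPosL}^{-1} \, \mathrm{CamPosR} \, (0 \; 0 \; 0 \; 1)^T. $$

If the cameras do not have parallel optic axes, then the transformation
$(\mathrm{CamPosL}^{-1} \, \mathrm{CamPosR})$ must be applied to $\mathbf{u}_R$ to rotate it with respect to $\mathbf{u}_L$.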
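
The two-vector method is compact enough to sketch in full. The Python below is an illustration under stated assumptions: the function names are invented for this example, the cameras are taken to have parallel optic axes so that no rotation of the right ray is needed, and the baseline and test point are synthetic. The scale formulas are the closed-form minimizers of the squared distance between the heads of the two ray vectors.

```python
import math

CAM_F, CAM_S = 980.5, 1.289   # effective focal length and aspect ratio (Section 13)

def dot(a, b):
    return sum(p * q for p, q in zip(a, b))

def unit_ray(xh, yh, f=CAM_F, s=CAM_S):
    """Unit vector from the lens center through the centered image point (xh, yh)."""
    v = (xh, yh / s, f)
    n = math.sqrt(dot(v, v))
    return tuple(c / n for c in v)

def stereo_point(u_l, u_r, delta):
    """Midpoint of the closest approach of the two rays, in left-camera coordinates."""
    c = dot(u_l, u_r)
    den = 1.0 - c * c
    if den < 1e-12:
        raise ValueError("rays are parallel; depth is undetermined")
    p, q = dot(u_l, delta), dot(u_r, delta)
    a_l = (p - c * q) / den          # scale along the left ray
    a_r = (c * p - q) / den          # scale along the right ray
    head_l = [a_l * u for u in u_l]
    head_r = [a_r * u + d for u, d in zip(u_r, delta)]
    return [(m + n) / 2.0 for m, n in zip(head_l, head_r)]

# Synthetic check with parallel cameras 165 mm apart along X.
delta = (165.0, 0.0, 0.0)
truth = (50.0, -30.0, 1500.0)
pix_l = (CAM_F * truth[0] / truth[2], CAM_F * CAM_S * truth[1] / truth[2])
pix_r = (CAM_F * (truth[0] - delta[0]) / truth[2], CAM_F * CAM_S * truth[1] / truth[2])
estimate = stereo_point(unit_ray(*pix_l), unit_ray(*pix_r), delta)
```

With noise-free pixels the two rays intersect and the midpoint coincides with the true point; with noisy pixels the midpoint is the compromise between the two rays.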

13.  Lab  Constants 

These values are taken from the directory /u/brown/robot/include, where there are
several files of the form Xconsts.h.


13.1.  Head  Constants 

NECK_OFFSET    (-149.2)    /* pitch axis to tool axis */
LEFT_OFFSET    (-12.7)     /* tool Z axis to Left camera yaw axis */
RIGHT_OFFSET   (152.4)     /* tool Z axis to Right camera yaw axis */
ALT_OFFSET     (-65.1)     /* cam axis to pitch (altitude) axis */
AZ_OFFSET      (34.9)      /* nodal point to yaw (azimuth) axis */


13.2. Camera Constants

CAM_F          980.5       /* imaging system effective focal length */
CAM_S          1.289       /* imaging system pixel aspect ratio y/x */

/* the following constants allow computation of how many pixels to move for a
corresponding change in angle, and vice-versa. They are accurate near image center, off
by a few pixels in the periphery due to failure of small-angle assumption. The focus
distance at which they are computed is the "standard focus distance" of 134 cm, by RP
and CB on 5 July. The CAM_F above also applies at this distance. */

CAM_X_P_D      17.75       /* pixels per degree in x */
CAM_Y_P_D      22.79       /* pixels per degree in y */


13.3. Robot Constants

/* init xyzoat when all joint angles 0 */

INIT_X         650.0
INIT_Y         190.13
INIT_Z         975.0
INIT_O         90.0
INIT_A         (-90.0)
INIT_T         0.0
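
The pixels-per-degree constants of Section 13.2 support a simple linear conversion between image displacements and camera angles. A minimal sketch follows; the function names are hypothetical and do not come from the lab headers, and the constants are repeated locally so the fragment stands alone. As the comment on the constants notes, the linear relation is only trustworthy near the image center.

```c
/* Values copied from Section 13.2 (valid at the standard focus distance). */
#define CAM_X_P_D 17.75   /* pixels per degree in x */
#define CAM_Y_P_D 22.79   /* pixels per degree in y */

/* Angle change (degrees) corresponding to an image shift of dx pixels. */
double pixels_to_degrees_x(double dx) { return dx / CAM_X_P_D; }

/* Angle change (degrees) corresponding to an image shift of dy pixels. */
double pixels_to_degrees_y(double dy) { return dy / CAM_Y_P_D; }

/* Image shift (pixels) produced by a rotation of deg degrees about the
   yaw (azimuth) axis. */
double degrees_to_pixels_x(double deg) { return deg * CAM_X_P_D; }

/* Image shift (pixels) produced by a rotation of deg degrees about the
   pitch (altitude) axis. */
double degrees_to_pixels_y(double deg) { return deg * CAM_Y_P_D; }
```

For example, a feature seen 35.5 pixels from the image center in x would call for a camera azimuth change of about 2 degrees to center it; the sign of the correction depends on the image and angle conventions in use.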


14.  Rigid  Transformation  Library 

Edmond Lee wrote a library for manipulating rigid transforms as an extension to
libmatrix, based on an earlier column-vector version by Dave Coombs. It provides basic
and efficient implementations of standard data structures and operations. Following are
excerpts from its header file.


typedef struct matrix *pt_t;    /* homog point (4x1 col. vector) */
typedef struct matrix *tr_t;    /* homog transformation (4x4 matrix) */

pt_t pt_zero();                                     /* Return new point 0 0 0 1 */
pt_t pt_rotx(/* pt_t p, double r */);               /* sideeffects p, rot about X by r */
pt_t pt_roty(/* pt_t p, double r */);
pt_t pt_rotz(/* pt_t p, double r */);
pt_t pt_translate(/* pt_t p, double x,y,z */);
pt_t pt_norm(/* pt_t p */);                         /* sideeffects p by normalizing it */
pt_t pt_transform(/* pt_t p, tr_t T, pt_t q */);    /* q = Tp. returns q */

tr_t tr_ident();                                    /* returns Identity transform */
tr_t tr_rotx(/* tr_t A, double r */);               /* sideeffects A, rot about X by r */
tr_t tr_roty(/* tr_t A, double r */);
tr_t tr_rotz(/* tr_t A, double r */);
tr_t tr_translate(/* tr_t A, double x,y,z */);
tr_t tr_invert(/* tr_t A, B */);                    /* B is A inverted, B returned */
tr_t tr_transform(/* tr_t A, tr_t B, tr_t C */);    /* C = A*B, C returned */


15.  Sample  Robot  Data  Structures  and  Functions 

Data structure requirements for applications vary, but structures of the following
sort have been successfully used for various vision tasks, and may serve as a
template for future (or possibly even standard) robot and head data structures, access
functions, and update functions.

/* ---------------------------------------------------------------- */

/* intrinsic camera properties */

typedef struct timeval *timestamp_t;

typedef struct Camera
{
    int          verbose;
    timestamp_t  Ca_Time;         /* time last modified */
    double       Ca_Focus_Dist;
    double       Ca_Fstop;
    pt_t         Phys_to_Pixel;   /* x and y shifts for phys to pix coords. */
    tr_t         Ca_Lens;         /* C matrix containing f and sf */
} *Camera_t;

/* ---------------------------------------------------------------- */

/* Head geometry */

typedef struct Head_Config    /* uses consts in headconsts.h, camconsts.h */
{
    int          verbose;
    timestamp_t  Hd_Time;
    double       Hd_Alt;          /* head altitude angle (pitch) */
    pt_t         Hd_Nose;         /* offset of end of nose (FLANGE coords) */

    /* ------------------------------------------------------------ */

    double       Hd_AzL;          /* left camera azimuth */
    pt_t         Hd_CamL_Axis;
    /* This vector holds the neck and the camera yaw axis offset, specifying
       offset from FLANGE origin to last rigid point in head kinematic chain.
       Useful intermediate transform if only eye movements are happening. */
    tr_t         Hd_CamL_Pos;     /* CamPos: camera's position in FLANGE. */
    tr_t         Hd_CamL_Inv;     /* Inverse of CamL_Pos */
    Camera_t     Hd_CamL;         /* Left Camera state */

    /* ---------------- Right Camera is Similar ---------------- */

    double       Hd_AzR;
    pt_t         Hd_CamR_Axis;
    tr_t         Hd_CamR_Pos;
    tr_t         Hd_CamR_Inv;
    Camera_t     Hd_CamR;
} *Head_Config_t;
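
The head structure stores each camera position transform together with its inverse. For a rigid transform with rotation part R and translation part t, the inverse has the closed form (R^T, -R^T t), so it is cheap to recompute whenever the camera moves. The sketch below illustrates that closed form on plain arrays; the `mat4` type and both function names are hypothetical, and nothing here is claimed about how the library's tr_invert is actually implemented.

```c
typedef double mat4[4][4];   /* hypothetical stand-in for tr_t */

/* B = inverse of the rigid transform A (rotation plus translation only).
   B must not alias A. */
void rigid_invert(mat4 A, mat4 B)
{
    int i, j;

    /* Rotation part: the inverse of a rotation is its transpose. */
    for (i = 0; i < 3; i++)
        for (j = 0; j < 3; j++)
            B[i][j] = A[j][i];

    /* Translation part: -R^T t. */
    for (i = 0; i < 3; i++) {
        B[i][3] = -(B[i][0] * A[0][3] + B[i][1] * A[1][3] + B[i][2] * A[2][3]);
        B[3][i] = 0.0;
    }
    B[3][3] = 1.0;
}

/* C = A*B; C must not alias A or B. Used here only to check the inverse. */
void mat4_mul(mat4 A, mat4 B, mat4 C)
{
    int i, j, k;
    for (i = 0; i < 4; i++)
        for (j = 0; j < 4; j++) {
            C[i][j] = 0.0;
            for (k = 0; k < 4; k++)
                C[i][j] += A[i][k] * B[k][j];
        }
}
```

Keeping the inverse alongside the forward transform trades a little storage for avoiding repeated inversions when many points are mapped from FLANGE into camera coordinates between head movements.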


/* ---------------------------------------------------------------- */

/* Puma Geometry */

#define PURDUE  0
#define TYPE    1
#define SIM     -1

typedef struct Rob_Config
{
    int            verbose;
    int            Rob_man_sim;       /* purdue, console, or simulated */
    int            Rob_ddAlvin;       /* purdue puma device descriptor */
    timestamp_t    Rob_Time;

    double         Rob_Speed[2];      /* [0] is speed, [1] is mode */
    double         Rob_Jnt[6];        /* Joint angles of Joints 1-6 in order */
    double         Rob_Location[6];
    /* Current TOOL location as X,Y,Z,O,A,T. Updated as Robot moves. */

    double         Rob_Tool[6];
    /* Current TOOL transform as X,Y,Z,O,A,T. Set by user, same as A 7, defines TOOL in
       relation to does not move with robot. */

    tr_t           Rob_FLANGE;        /* transform to move LAB to FLANGE */
    tr_t           Rob_Fl_Inv;        /* inverse of above */
    Head_Config_t  Rob_Head;          /* Head configuration struct */
} *Rob_Config_t;

/* ---------------------------- Functions ---------------------------- */

/* rob_kine.c */

extern tr_t Loc_To_FLANGE(/* Loc (xyzoat) array */);
/* Returns a CS for the Rob_FLANGE entry in the Config */

extern Rob_Config_t Rob_Setup(/* simulation type */);
/* Creates Rob config, also calls head config setup */

extern void Rob_Set_Tool(/* Loc */);
/* Set the TOOL CS to be that represented by Loc */

extern void Rob_Free(/* Rob_Config_t */);    /* destroys structure */

extern void Rob_Move(/* Rob_Config_t, Rob_Loc */);
/* Depending on mode, moves robot or not. Updates structures. */

extern void Rob_Dump();    /* print robot state */


/* head_kine.c */

extern Head_Config_t Head_Setup();
/* Creates the head_config, puts in pre-computable tr_t's */

extern void Head_Move(/* Head_Config_t, Left_Az, Right_Az, Alt */);
/* computes L and R Pos and Cam matrices in FLANGE coords, moves or not depending
   on robot mode. */


extern  void 
extern  void 
extern  void 